NSF PAR Search | NSF Public Access Repository

Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

Guo, Han; Hosseini, Ramtin; Zhang, Ruiyi; Somayajula, Sai Ashish; Chowdhury, Ranak Roy; Gupta, Rajesh K; Xie, Pengtao (April 2025, Transactions on machine learning research)

Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches propose masking based on patch informativeness. However, these methods often do not consider the specific requirements of downstream tasks, potentially leading to suboptimal representations for these tasks. In response, we introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that leverages end-to-end feedback from downstream tasks to learn an optimal masking strategy during pretraining. Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning. Compared to existing methods, it demonstrates remarkable improvements across diverse datasets and tasks, showcasing its adaptability and efficiency. Our code is available at https://github.com/Alexiland/MLO-MAE
Free, publicly-accessible full text available April 11, 2026
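
As a rough illustration of the uniform random masking that the abstract above attributes to the baseline MAE, the following Python sketch picks which patch indices to hide and which to keep visible. It is not the MLO-MAE code from the linked repository. The function name `random_patch_mask`, the 196-patch grid, and the 0.75 mask ratio are assumptions chosen for the example and do not come from the abstract.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) by uniform random sampling.

    Every patch is equally likely to be masked, which is the behaviour the
    abstract identifies as a limitation. mask_ratio=0.75 is an assumed default.
    """
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)  # uniform random permutation of patch indices
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


# Hypothetical example: a 14 x 14 grid of patches (196 in total).
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))
```

A learned masking strategy, as the abstract proposes, would replace the uniform shuffle with per-patch scores; how MLO-MAE computes those scores is not stated here.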
Unleashing the Power of Shared Label Structures for Human Activity Recognition

https://doi.org/10.1145/3583780.3615101

Zhang, Xiyuan; Chowdhury, Ranak Roy; Zhang, Jiayun; Hong, Dezhi; Gupta, Rajesh K; Shang, Jingbo (October 2023, CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management)
SQEE: A Machine Perception Approach to Sensing Quality Evaluation at the Edge by Uncertainty Quantification

https://doi.org/10.1145/3560905.3568534

Li, Shuheng; Shang, Jingbo; Gupta, Rajesh K.; Hong, Dezhi (November 2022, Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems)

Cyber-physical systems are starting to adopt neural network (NN) models for a variety of smart sensing applications. While several efforts seek better NN architectures to improve system performance, few attempts have been made to study the deployment of these systems in the field. Proper deployment of these systems is critical to achieving ideal performance, but the current practice is largely empirical trial and error, lacking a measure of quality. Sensing quality should reflect the impact on the performance of the NN models that drive machine perception tasks. However, traditional approaches either evaluate statistical differences that exist objectively or model the quality subjectively via human perception. In this work, we propose an efficient sensing quality measure that requires only limited data samples, using a smart voice sensing system as an example. We adopt recent techniques in uncertainty evaluation for NNs to estimate audio sensing quality. Intuitively, a deployment at a better sensing location should lead to less uncertainty in NN predictions. We design SQEE, Sensing Quality Evaluation at the Edge for NN models, which constructs a model ensemble through Monte-Carlo dropout and estimates posterior total uncertainty via average conditional entropy. We collected data from three indoor environments, covering a total of 148 transmitting-receiving (t-r) locations and testing more than 7,000 examples. Compared with other uncertainty strategies, SQEE achieves the best top-1 ranking accuracy, that is, whether the measure finds the best spot for deployment. We implemented SQEE on a ReSpeaker to study its real-world efficacy. Experimental results show that SQEE can effectively evaluate the data collected from each t-r location pair within 30 seconds and achieve an average top-3 ranking accuracy of over 94%. We further discuss the generalization of our framework to other sensing schemes.
Full Text Available
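
The uncertainty measure named in the SQEE abstract above can be sketched in a few lines of Python. The sketch assumes the usual reading of "average conditional entropy": run several stochastic forward passes with dropout left on, compute the entropy of each pass's class-probability vector, and average those entropies. Whether SQEE uses exactly this formulation is an assumption, and the function names and probability values below are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def average_conditional_entropy(passes):
    """Mean entropy over stochastic forward passes.

    `passes` holds one class-probability vector per Monte-Carlo dropout pass;
    the network that would produce them is outside the scope of this sketch.
    """
    if not passes:
        raise ValueError("need at least one forward pass")
    return sum(entropy(p) for p in passes) / len(passes)


# Hypothetical outputs for two candidate sensing locations.
confident = [[0.90, 0.05, 0.05], [0.85, 0.10, 0.05]]
uncertain = [[0.40, 0.30, 0.30], [0.34, 0.33, 0.33]]

scores = {
    "location_a": average_conditional_entropy(confident),
    "location_b": average_conditional_entropy(uncertain),
}
# Lower uncertainty is read as better sensing quality, following the abstract.
ranking = sorted(scores, key=scores.get)
print(ranking)
```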
TARNet: Task-Aware Reconstruction for Time-Series Transformer

https://doi.org/10.1145/3534678.3539329

Chowdhury, Ranak Roy; Zhang, Xiyuan; Shang, Jingbo; Gupta, Rajesh K.; Hong, Dezhi (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
Local Binary Pattern Networks

https://doi.org/10.1109/WACV45572.2020.9093550

Lin, Jeng-Hau; Lazarow, Justin; Yang, Yunfan; Hong, Dezhi; Gupta, Rajesh K.; Tu, Zhuowen (March 2020, IEEE Winter Conference on Applications of Computer Vision)

Emerging edge devices such as sensor nodes are increasingly being given non-trivial tasks related to sensor data processing and even application-level inferences from this sensor data. These devices are, however, extraordinarily resource-constrained in terms of CPU power (often Cortex M0-3 class CPUs), available memory (a few KB to a few MB), and energy. Under these constraints, we explore a novel approach to character recognition using local binary pattern networks, or LBPNet, that can learn and perform bit-wise operations in an end-to-end fashion. LBPNet is advantageous for characters whose features are composed of structured strokes and distinctive outlines. LBPNet uses local binary comparisons and random projections in place of conventional convolution (or approximation of convolution) operations, providing an important means to improve memory efficiency as well as inference speed. We evaluate LBPNet on a number of character recognition benchmark datasets as well as several object classification datasets and demonstrate its effectiveness and efficiency.
Full Text Available
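
To make the "local binary comparisons" in the LBPNet abstract above concrete, the following Python sketch computes the classic fixed 3 x 3 local binary pattern code for one pixel. This is the textbook operator, shown only as background: per the abstract, LBPNet learns its bit-wise operations end to end and adds random projections, neither of which is reproduced here. The function name, neighbour ordering, and sample patches are assumptions made for the example.

```python
def lbp_code(image, row, col):
    """Classic 8-neighbour local binary pattern code for pixel (row, col).

    Each neighbour contributes one bit: 1 if it is >= the centre pixel, else 0.
    Only comparisons and bit operations are used, with no multiplications.
    """
    center = image[row][col]
    # Clockwise from the top-left neighbour; this ordering is an arbitrary choice.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical 3 x 3 patches: bright corners around a mid-grey centre, and a flat patch.
corner_patch = [
    [9, 1, 9],
    [1, 5, 1],
    [9, 1, 9],
]
flat_patch = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]

print(lbp_code(corner_patch, 1, 1), lbp_code(flat_patch, 1, 1))
```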
Serving deep neural networks at the cloud edge for vision applications on mobile platforms

https://doi.org/10.1145/3304109.3306221

Fang, Zhou; Hong, Dezhi; Gupta, Rajesh K. (January 2019, ACM Multimedia Systems Conference (MMSys) 2019)

Full Text Available
Critical Risk Indicators (CRIs) for the electric power grid: a survey and discussion of interconnected effects

https://doi.org/10.1007/s10669-021-09822-2

Che-Castaldo, Judy P.; Cousin, Rémi; Daryanto, Stefani; Deng, Grace; Feng, Mei-Ling E.; Gupta, Rajesh K.; Hong, Dezhi; McGranaghan, Ryan M.; Owolabi, Olukunle O.; Qu, Tianyi; et al (July 2021, Environment Systems and Decisions)
The electric power grid is a critical societal resource connecting multiple infrastructural domains such as agriculture, transportation, and manufacturing. The electrical grid as an infrastructure is shaped by human activity and public policy in terms of demand and supply requirements. Further, the grid is subject to changes and stresses due to diverse factors including solar weather, climate, hydrology, and ecology. The emerging interconnected and complex network dependencies make such interactions increasingly dynamic, posing novel risks, and presenting new challenges to manage the coupled human–natural system. This paper provides a survey of models and methods that seek to explore the significant interconnected impact of the electric power grid and interdependent domains. We also provide relevant critical risk indicators (CRIs) across diverse domains that may be used to assess risks to electric grid reliability, including climate, ecology, hydrology, finance, space weather, and agriculture. We discuss the convergence of indicators from individual domains to explore possible systemic risk, i.e., holistic risk arising from cross-domain interconnections. Further, we propose a compositional approach to risk assessment that incorporates diverse domain expertise and information, data science, and computer science to identify domain-specific CRIs and their union in systemic risk indicators. Our study provides an important first step towards data-driven analysis and predictive modeling of risks in interconnected human–natural systems.
Full Text Available
SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks

https://doi.org/10.1109/ISCA.2018.00061

Akhlaghi, Vahideh; Yazdanbakhsh, Amir; Samadi, Kambiz; Gupta, Rajesh K.; Esmaeilzadeh, Hadi (June 2018, ISCA)

Full Text Available
Brick: Metadata schema for portable smart building applications

https://doi.org/10.1016/j.apenergy.2018.02.091

Balaji, Bharathan; Bhattacharya, Arka; Fierro, Gabriel; Gao, Jingkun; Gluck, Joshua; Hong, Dezhi; Johansen, Aslak; Koh, Jason; Ploennigs, Joern; Agarwal, Yuvraj; et al (September 2018, Applied Energy)

Full Text Available
